Load the necessary libraries
library(car) # for regression diagnostics
library(broom) # for tidy output
library(ggfortify) # for model diagnostics
library(sjPlot) # for outputs
library(knitr) # for kable
library(effects) # for partial effects plots
library(emmeans) # for estimating marginal means
library(ggeffects) # for plotting marginal means
library(MASS) # for glm.nb
library(MuMIn) # for AICc
library(tidyverse) # for data wrangling
library(modelr) # for auxiliary modelling functions
library(performance) # for residuals diagnostics
library(see) # for plotting residuals
library(DHARMa) # for residual diagnostics plots
library(patchwork) # grid of plots
library(scales) # for more scales
Here is a modified example from Quinn and Keough (2002). Day and Quinn (1989) described an experiment that examined how rock surface type affected the recruitment of barnacles to a rocky shore. The experiment had a single factor, surface type, with 4 treatments or levels: algal species 1 (ALG1), algal species 2 (ALG2), naturally bare surfaces (NB) and artificially scraped bare surfaces (S). There were 5 replicate plots for each surface type and the response (dependent) variable was the number of newly recruited barnacles on each plot after 4 weeks.
Six-plated barnacle
Format of day.csv data files
| TREAT | BARNACLE |
|---|---|
| ALG1 | 27 |
| .. | .. |
| ALG2 | 24 |
| .. | .. |
| NB | 9 |
| .. | .. |
| S | 12 |
| .. | .. |
| TREAT | Categorical listing of surface types. ALG1 = algal species 1, ALG2 = algal species 2, NB = naturally bare surface, S = scraped bare surface. |
| BARNACLE | The number of newly recruited barnacles on each plot after 4 weeks. |
As we are going to treat Treatment as a categorical predictor, we will specifically declare it as such straight after importing the data.
day <- read_csv("../public/data/day.csv", trim_ws = TRUE)
glimpse(day)
## Rows: 20
## Columns: 2
## $ TREAT <chr> "ALG1", "ALG1", "ALG1", "ALG1", "ALG1", "ALG2", "ALG2", "ALG2…
## $ BARNACLE <dbl> 27, 19, 18, 23, 25, 24, 33, 27, 26, 32, 9, 13, 17, 14, 22, 12…
day <- day %>%
mutate(TREAT = factor(TREAT))
Model formula: \[ y_i \sim{} \mathcal{Pois}(\lambda_i)\\ log(\lambda_i) = \boldsymbol{\beta} \bf{X_i} \]
where \(\boldsymbol{\beta}\) is a vector of effects parameters and \(\bf{X}\) is a model matrix representing the intercept and treatment contrasts for the effects of Treatment on barnacle recruitment.
ggplot(day, aes(y = BARNACLE, x = TREAT)) +
geom_boxplot() +
geom_point(color = "red")
ggplot(day, aes(y = BARNACLE, x = TREAT)) +
geom_violin() +
geom_point(color = "red")
Categorical predictors are actually handled as multiple regression. We start by dummy coding the levels into separate vectors containing only 0’s and 1’s. The following fabricated example should illustrate this process. The first column represents a response (the numbers are not important for the dummy coding and thus they are excluded from the example). Column X represents a categorical variable with three levels (A, B and C). This could be dummy coded into three dummies (one for each level of the categorical predictor).
| Y | X | Dummy 1 | Dummy 2 | Dummy 3 |
|---|---|---|---|---|
| . | A | 1 | 0 | 0 |
| . | A | 1 | 0 | 0 |
| . | B | 0 | 1 | 0 |
| . | B | 0 | 1 | 0 |
| . | C | 0 | 0 | 1 |
| . | C | 0 | 0 | 1 |
So, we could construct the linear predictor of a regression model as:
\[ log(\mu_i) = \beta_0 + \beta_1 D1 + \beta_2 D2 + \beta_3 D3 \]
however, this would be an overparameterized model as there are four effects parameters to estimate from only three chunks of data (one per level of the categorical predictor).
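The overparameterisation can be seen directly in R. The following sketch (using a fabricated three-level factor, not the barnacle data) builds the full set of dummies plus an intercept and shows that one column is redundant:

```r
# Fabricated three-level factor (A, B, C), two replicates each
X <- factor(rep(c("A", "B", "C"), each = 2))
# One dummy per level (cell-means coding) plus an intercept column
D <- cbind(`(Intercept)` = 1, model.matrix(~ 0 + X))
# The intercept equals the sum of the three dummies, so the
# matrix has rank 3, not 4: one parameter cannot be estimated
qr(D)$rank
```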
One option would be to drop the intercept:
\[ log(\mu_i) = \beta_1 D1 + \beta_2 D2 + \beta_3 D3 \] Now each \(\beta\) just represents the mean of each level of the categorical predictor. Although this is perfectly valid, it is not a common way to express the model as it does not include any effects (differences).
Another option is to re-parameterise this model such that the intercept represents the mean of the first level of the categorical predictor and then two additional \(\beta\) parameters represent the differences between the first level and each of the remaining levels. This parameterisation is called treatment contrasts.
When we fit treatment contrasts, the first dummy variable is populated with 1’s.
| Y | X | Dummy 1 | Dummy 2 | Dummy 3 |
|---|---|---|---|---|
| . | A | 1 | 0 | 0 |
| . | A | 1 | 0 | 0 |
| . | B | 1 | 1 | 0 |
| . | B | 1 | 1 | 0 |
| . | C | 1 | 0 | 1 |
| . | C | 1 | 0 | 1 |
\[ log(\mu_i) = \beta_0 + \beta_2 D2 + \beta_3 D3 \]
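R’s default coding for unordered factors is exactly this treatment-contrast scheme; `contr.treatment()` returns the dummy columns (the column of 1’s for the intercept is added when the model matrix is constructed):

```r
# Treatment contrasts for a three-level factor: level 1 is the
# reference, and each column codes one of the remaining levels
contr.treatment(3)
##   2 3
## 1 0 0
## 2 1 0
## 3 0 1
```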
Before fitting the model to our actual data, it might be instructive to estimate the effects via matrix multiplication so as to see the treatment contrasts in action. Note, this approach is essentially Ordinary Least Squares regression and, as such, assumes normality and homogeneity of variance. As the response observations are counts, a Gaussian distribution might not be the most appropriate choice. Nevertheless, it will suffice for the purpose of illustration.
# Effects model
## start by dummy coding the categorical predictor and expressing
## the treatment effects as a model matrix.
Xmat <- model.matrix(~TREAT, data = day)
Xmat %>% head()
## (Intercept) TREATALG2 TREATNB TREATS
## 1 1 0 0 0
## 2 1 0 0 0
## 3 1 0 0 0
## 4 1 0 0 0
## 5 1 0 0 0
## 6 1 1 0 0
# solve(X'X)X'Y
solve(t(Xmat) %*% Xmat) %*% t(Xmat) %*% day$BARNACLE
## [,1]
## (Intercept) 22.4
## TREATALG2 6.0
## TREATNB -7.4
## TREATS -9.2
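As a sanity check, the normal-equations solution \((X'X)^{-1}X'Y\) should reproduce the coefficients that `lm()` estimates. A self-contained sketch with fabricated data (so it does not depend on the day data being loaded):

```r
set.seed(1)
# Fabricated response for a three-level factor, four replicates per level
d <- data.frame(
  X = factor(rep(c("A", "B", "C"), each = 4)),
  Y = c(rnorm(4, mean = 10), rnorm(4, mean = 12), rnorm(4, mean = 8))
)
Xmat <- model.matrix(~X, data = d)
# Normal equations: beta = (X'X)^-1 X'Y
beta <- solve(t(Xmat) %*% Xmat) %*% t(Xmat) %*% d$Y
# Matches the coefficients estimated by lm()
all.equal(as.vector(beta), unname(coef(lm(Y ~ X, data = d))))  # TRUE
```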
Now let's fit the model. As the response observations are counts, a Poisson distribution would be an obvious choice for the model distribution. That said, it might be useful to also fit the model with a Gaussian distribution and an identity link (no transformation) to assist us in interpreting the estimated coefficients. To emphasise, the Gaussian model will only be used as a learning aid; as a model, it has less merit than the Poisson model.
day.glm <- glm(BARNACLE ~ TREAT, data = day, family = "gaussian")
day.glm1 <- glm(BARNACLE ~ TREAT, data = day, family = "poisson")
day.glm %>% autoplot()
Conclusion:
day.glm1 %>% autoplot()
Conclusion:
day.glm %>% influence.measures()
## Influence measures of
## glm(formula = BARNACLE ~ TREAT, family = "gaussian", data = day) :
##
## dfb.1_ dfb.TREATA dfb.TREATN dfb.TREATS dffit cov.r cook.d hat inf
## 1 6.05e-01 -4.28e-01 -4.28e-01 -4.28e-01 0.6053 1.115 0.08900 0.2
## 2 -4.38e-01 3.10e-01 3.10e-01 3.10e-01 -0.4378 1.326 0.04862 0.2
## 3 -5.77e-01 4.08e-01 4.08e-01 4.08e-01 -0.5766 1.152 0.08143 0.2
## 4 7.54e-02 -5.33e-02 -5.33e-02 -5.33e-02 0.0754 1.608 0.00151 0.2
## 5 3.31e-01 -2.34e-01 -2.34e-01 -2.34e-01 0.3313 1.442 0.02843 0.2
## 6 7.16e-17 -4.08e-01 -5.34e-17 -4.47e-17 -0.5766 1.152 0.08143 0.2
## 7 -7.51e-17 4.28e-01 5.60e-17 4.70e-17 0.6053 1.115 0.08900 0.2
## 8 2.19e-17 -1.25e-01 -1.63e-17 -1.37e-17 -0.1766 1.565 0.00824 0.2
## 9 3.79e-17 -2.16e-01 -2.82e-17 -2.37e-17 -0.3051 1.467 0.02423 0.2
## 10 -5.77e-17 3.29e-01 4.30e-17 3.61e-17 0.4650 1.293 0.05451 0.2
## 11 1.47e-17 -4.15e-17 -5.78e-01 0.00e+00 -0.8180 0.839 0.15141 0.2
## 12 4.54e-18 -1.28e-17 -1.79e-01 0.00e+00 -0.2533 1.512 0.01682 0.2
## 13 -4.54e-18 1.28e-17 1.79e-01 0.00e+00 0.2533 1.512 0.01682 0.2
## 14 2.25e-18 -6.38e-18 -8.90e-02 0.00e+00 -0.1259 1.591 0.00421 0.2
## 15 -1.77e-17 5.00e-17 6.98e-01 0.00e+00 0.9866 0.643 0.20609 0.2
## 16 -1.12e-18 -4.95e-18 8.12e-18 -1.07e-01 -0.1512 1.579 0.00606 0.2
## 17 -5.15e-18 -2.27e-17 3.73e-17 -4.91e-01 -0.6937 0.998 0.11373 0.2
## 18 1.69e-18 7.46e-18 -1.22e-17 1.61e-01 0.2276 1.532 0.01363 0.2
## 19 7.06e-18 3.12e-17 -5.11e-17 6.73e-01 0.9515 0.681 0.19448 0.2
## 20 -2.07e-18 -9.14e-18 1.50e-17 -1.97e-01 -0.2791 1.490 0.02036 0.2
day.glm1 %>% influence.measures()
## Influence measures of
## glm(formula = BARNACLE ~ TREAT, family = "poisson", data = day) :
##
## dfb.1_ dfb.TREATA dfb.TREATN dfb.TREATS dffit cov.r cook.d hat inf
## 1 5.08e-01 -3.80e-01 -3.22e-01 -3.09e-01 0.5078 1.240 0.07380 0.2
## 2 -3.93e-01 2.94e-01 2.49e-01 2.39e-01 -0.3929 1.377 0.04032 0.2
## 3 -5.20e-01 3.89e-01 3.30e-01 3.17e-01 -0.5203 1.224 0.06752 0.2
## 4 6.59e-02 -4.93e-02 -4.17e-02 -4.01e-02 0.0659 1.611 0.00126 0.2
## 5 2.84e-01 -2.13e-01 -1.80e-01 -1.73e-01 0.2844 1.486 0.02358 0.2
## 6 -3.01e-17 -3.02e-01 2.01e-17 3.53e-17 -0.4548 1.305 0.05326 0.2
## 7 2.98e-17 2.99e-01 -1.99e-17 -3.50e-17 0.4508 1.310 0.05821 0.2
## 8 -9.16e-18 -9.20e-02 6.13e-18 1.08e-17 -0.1386 1.585 0.00539 0.2
## 9 -1.59e-17 -1.60e-01 1.06e-17 1.86e-17 -0.2403 1.522 0.01585 0.2
## 10 2.32e-17 2.33e-01 -1.55e-17 -2.72e-17 0.3511 1.422 0.03565 0.2
## 11 -2.42e-17 5.04e-17 -7.58e-01 3.04e-17 -0.9794 0.651 0.18750 0.2
## 12 -6.90e-18 1.43e-17 -2.16e-01 8.65e-18 -0.2787 1.491 0.02083 0.2
## 13 6.59e-18 -1.37e-17 2.06e-01 -8.26e-18 0.2663 1.501 0.02083 0.2
## 14 -3.38e-18 7.04e-18 -1.06e-01 4.24e-18 -0.1366 1.586 0.00521 0.2
## 15 2.45e-17 -5.10e-17 7.66e-01 -3.07e-17 0.9896 0.640 0.25521 0.2
## 16 -1.10e-18 -3.52e-18 -1.01e-17 -1.39e-01 -0.1758 1.566 0.00852 0.2
## 17 -5.54e-18 -1.78e-17 -5.08e-17 -7.03e-01 -0.8869 0.756 0.16004 0.2
## 18 1.59e-18 5.11e-18 1.46e-17 2.02e-01 0.2552 1.511 0.01918 0.2
## 19 6.41e-18 2.06e-17 5.88e-17 8.14e-01 1.0265 0.601 0.27367 0.2
## 20 -2.06e-18 -6.62e-18 -1.89e-17 -2.62e-01 -0.3301 1.443 0.02865 0.2
day.glm %>% performance::check_model()
Conclusion:
day.glm1 %>% performance::check_model()
Conclusion:
day.resids <- day.glm %>% simulateResiduals(plot = TRUE)
day.resids %>% testDispersion()
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## ratioObsSim = 0.93311, p-value = 0.68
## alternative hypothesis: two.sided
Conclusion:
day.resids <- day.glm1 %>% simulateResiduals(plot = TRUE)
day.resids %>% testDispersion()
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## ratioObsSim = 0.91946, p-value = 0.68
## alternative hypothesis: two.sided
Conclusion:
## overdispersion
1 - pchisq(day.glm$deviance, day.glm$df.residual)
## [1] 0
day.glm$deviance / day.glm$df.residual
## [1] 18.575
Conclusion:
1 - pchisq(day.glm1$deviance, day.glm1$df.residual)
## [1] 0.3719059
day.glm1$deviance / day.glm1$df.residual
## [1] 1.075851
Conclusion:
Conclusion:
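An alternative check is the Pearson dispersion statistic: the sum of squared Pearson residuals divided by the residual degrees of freedom, which should be close to 1 for a well-specified Poisson model. A sketch using a hypothetical helper (`dispersion_stat` is not a standard function), demonstrated on fabricated Poisson counts rather than the day data:

```r
# Hypothetical helper: Pearson dispersion statistic for a fitted GLM
dispersion_stat <- function(mod) {
  sum(residuals(mod, type = "pearson")^2) / df.residual(mod)
}

# Fabricated Poisson counts for two groups with different means
set.seed(1)
d <- data.frame(x = factor(rep(c("A", "B"), each = 10)))
d$y <- rpois(nrow(d), lambda = ifelse(d$x == "A", 5, 15))
mod <- glm(y ~ x, data = d, family = "poisson")
dispersion_stat(mod)  # close to 1 for genuinely Poisson data
```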
day.glm %>% plot_model(type = "eff", show.data = TRUE)
## $TREAT
day.glm1 %>% plot_model(type = "eff", show.data = TRUE)
## $TREAT
day.glm %>%
allEffects() %>%
plot()
day.glm1 %>%
allEffects() %>%
plot()
allEffects(day.glm1, transformation = NULL) %>% plot()
Predicted values are always on the response scale
day.glm %>%
ggpredict() %>%
plot(add.data = TRUE, jitter = FALSE)
## $TREAT
day.glm1 %>%
ggpredict() %>%
plot(add.data = TRUE, jitter = FALSE)
## $TREAT
## back.transform only applies to response transformations, not link functions
## day.glm1 %>% ggpredict(back.transform = FALSE) %>% plot()
Predicted values are always on the response scale
day.glm %>%
ggemmeans(~TREAT) %>%
plot(add.data = TRUE, jitter = FALSE)
day.glm1 %>%
ggemmeans(~TREAT) %>%
plot(add.data = TRUE, jitter = FALSE)
## back.transform only applies to response transformations, not link functions
## day.glm1 %>% ggemmeans(back.transform = FALSE) %>% plot()
day.glm %>% summary()
##
## Call:
## glm(formula = BARNACLE ~ TREAT, family = "gaussian", data = day)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -6.00 -2.65 -1.10 2.85 7.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 22.400 1.927 11.622 3.27e-09 ***
## TREATALG2 6.000 2.726 2.201 0.04275 *
## TREATNB -7.400 2.726 -2.715 0.01530 *
## TREATS -9.200 2.726 -3.375 0.00386 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 18.575)
##
## Null deviance: 1033.8 on 19 degrees of freedom
## Residual deviance: 297.2 on 16 degrees of freedom
## AIC: 120.73
##
## Number of Fisher Scoring iterations: 2
Conclusions:
day.glm %>% confint()
## 2.5 % 97.5 %
## (Intercept) 18.622300 26.177700
## TREATALG2 0.657525 11.342475
## TREATNB -12.742475 -2.057525
## TREATS -14.542475 -3.857525
day.glm %>% tidy(conf.int = TRUE)
# warning this is only appropriate for html output
day.glm %>% sjPlot::tab_model(show.se = TRUE, show.aic = TRUE)
|  | BARNACLE | |||
|---|---|---|---|---|
| Predictors | Estimates | std. Error | CI | p |
| (Intercept) | 22.40 | 1.93 | 18.62 – 26.18 | <0.001 |
| TREAT [ALG2] | 6.00 | 2.73 | 0.66 – 11.34 | 0.043 |
| TREAT [NB] | -7.40 | 2.73 | -12.74 – -2.06 | 0.015 |
| TREAT [S] | -9.20 | 2.73 | -14.54 – -3.86 | 0.004 |
| Observations | 20 | |||
| R2 Nagelkerke | 1.000 | |||
| AIC | 120.731 | |||
day.glm1 %>% summary()
##
## Call:
## glm(formula = BARNACLE ~ TREAT, family = "poisson", data = day)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6748 -0.6522 -0.2630 0.5699 1.7380
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.10906 0.09449 32.903 < 2e-16 ***
## TREATALG2 0.23733 0.12638 1.878 0.060387 .
## TREATNB -0.40101 0.14920 -2.688 0.007195 **
## TREATS -0.52884 0.15518 -3.408 0.000654 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 54.123 on 19 degrees of freedom
## Residual deviance: 17.214 on 16 degrees of freedom
## AIC: 120.34
##
## Number of Fisher Scoring iterations: 4
Conclusions:
day.glm1 %>% confint()
## 2.5 % 97.5 %
## (Intercept) 2.917958584 3.2887098
## TREATALG2 -0.009471895 0.4865512
## TREATNB -0.696806649 -0.1108759
## TREATS -0.837588039 -0.2280991
day.glm1 %>%
confint() %>%
exp()
## 2.5 % 97.5 %
## (Intercept) 18.5034756 26.8082525
## TREATALG2 0.9905728 1.6266964
## TREATNB 0.4981736 0.8950498
## TREATS 0.4327530 0.7960454
day.glm1 %>% tidy(conf.int = TRUE)
day.glm1 %>% tidy(conf.int = TRUE, exponentiate = TRUE)
day.glm1 %>%
tidy(conf.int = TRUE, exponentiate = TRUE) %>%
kable()
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 22.4000000 | 0.0944911 | 32.903208 | 0.0000000 | 18.5034756 | 26.8082525 |
| TREATALG2 | 1.2678571 | 0.1263757 | 1.877957 | 0.0603870 | 0.9905728 | 1.6266964 |
| TREATNB | 0.6696429 | 0.1492042 | -2.687664 | 0.0071954 | 0.4981736 | 0.8950498 |
| TREATS | 0.5892857 | 0.1551776 | -3.407993 | 0.0006544 | 0.4327530 | 0.7960454 |
# warning this is only appropriate for html output
day.glm1 %>% sjPlot::tab_model(show.se = TRUE, show.aic = TRUE)
|  | BARNACLE | |||
|---|---|---|---|---|
| Predictors | Incidence Rate Ratios | std. Error | CI | p |
| (Intercept) | 22.40 | 2.12 | 18.50 – 26.81 | <0.001 |
| TREAT [ALG2] | 1.27 | 0.16 | 0.99 – 1.63 | 0.060 |
| TREAT [NB] | 0.67 | 0.10 | 0.50 – 0.90 | 0.007 |
| TREAT [S] | 0.59 | 0.09 | 0.43 – 0.80 | 0.001 |
| Observations | 20 | |||
| R2 Nagelkerke | 0.902 | |||
| AIC | 120.340 | |||
The estimated coefficients as presented in the summary tables above highlight very specific comparisons. However, there are other possible comparisons that we might be interested in. For example, whilst the treatment effects compare each of the substrate types against the first level (ALG1), we might also be interested in the differences between (for example) the two bare substrates (NB and S).
To get at more of the comparisons we have two broad approaches: post-hoc pairwise comparisons of all groups (e.g. Tukey's tests), or defining our own set of planned contrasts.
We will now only pursue the Poisson model.
day.glm1 %>% emmeans(pairwise ~ TREAT, type = "response")
## $emmeans
## TREAT rate SE df asymp.LCL asymp.UCL
## ALG1 22.4 2.12 Inf 18.6 27.0
## ALG2 28.4 2.38 Inf 24.1 33.5
## NB 15.0 1.73 Inf 12.0 18.8
## S 13.2 1.62 Inf 10.4 16.8
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## $contrasts
## contrast ratio SE df z.ratio p.value
## ALG1 / ALG2 0.789 0.0997 Inf -1.878 0.2375
## ALG1 / NB 1.493 0.2228 Inf 2.688 0.0362
## ALG1 / S 1.697 0.2633 Inf 3.408 0.0037
## ALG2 / NB 1.893 0.2703 Inf 4.472 <.0001
## ALG2 / S 2.152 0.3205 Inf 5.143 <.0001
## NB / S 1.136 0.1918 Inf 0.757 0.8735
##
## P value adjustment: tukey method for comparing a family of 4 estimates
## Tests are performed on the log scale
day.glm1 %>%
emmeans(~TREAT, type = "response") %>%
pairs()
## contrast ratio SE df z.ratio p.value
## ALG1 / ALG2 0.789 0.0997 Inf -1.878 0.2375
## ALG1 / NB 1.493 0.2228 Inf 2.688 0.0362
## ALG1 / S 1.697 0.2633 Inf 3.408 0.0037
## ALG2 / NB 1.893 0.2703 Inf 4.472 <.0001
## ALG2 / S 2.152 0.3205 Inf 5.143 <.0001
## NB / S 1.136 0.1918 Inf 0.757 0.8735
##
## P value adjustment: tukey method for comparing a family of 4 estimates
## Tests are performed on the log scale
day.glm1 %>%
emmeans(~TREAT, type = "response") %>%
pairs() %>%
confint()
day.glm1 %>%
emmeans(~TREAT, type = "response") %>%
pairs() %>%
summary(infer = TRUE)
## Let's store this for later
day.pairwise <- day.glm1 %>%
emmeans(~TREAT, type = "response") %>%
pairs() %>%
summary(infer = TRUE) %>%
as.data.frame()
Conclusions:
Define your own contrasts
Compare:
| Levels | Alg1 vs Alg2 | NB vs S | Alg vs Bare |
|---|---|---|---|
| Alg1 | 1 | 0 | 0.5 |
| Alg2 | -1 | 0 | 0.5 |
| NB | 0 | 1 | -0.5 |
| S | 0 | -1 | -0.5 |
cmat <- (cbind(
"Alg1_Alg2" = c(1, -1, 0, 0),
"NB_S" = c(0, 0, 1, -1),
"Alg_Bare" = c(0.5, 0.5, -0.5, -0.5)
))
cmat
## Alg1_Alg2 NB_S Alg_Bare
## [1,] 1 0 0.5
## [2,] -1 0 0.5
## [3,] 0 1 -0.5
## [4,] 0 -1 -0.5
It is important that each of the comparisons are independent of one another. We can test this by exploring the cross products of the contrast matrix. Independent comparisons will have a cross product of 0. So if all of the values in the lower (and equivalently, the upper) triangle of the cross product matrix are 0, then all comparisons are independent. Those that are not 0, indicate a lack of independence and a need to remove one of the comparisons from the set of contrasts. Note, we can ignore the diagonal values.
crossprod(cmat)
## Alg1_Alg2 NB_S Alg_Bare
## Alg1_Alg2 2 0 0
## NB_S 0 2 0
## Alg_Bare 0 0 1
Conclusions:
day.glm1 %>% emmeans(~TREAT, contr = list(TREAT = cmat), type = "response")
## $emmeans
## TREAT rate SE df asymp.LCL asymp.UCL
## ALG1 22.4 2.12 Inf 18.6 27.0
## ALG2 28.4 2.38 Inf 24.1 33.5
## NB 15.0 1.73 Inf 12.0 18.8
## S 13.2 1.62 Inf 10.4 16.8
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## $contrasts
## contrast ratio SE df z.ratio p.value
## TREAT.Alg1_Alg2 0.789 0.0997 Inf -1.878 0.0604
## TREAT.NB_S 1.136 0.1918 Inf 0.757 0.4488
## TREAT.Alg_Bare 1.792 0.1890 Inf 5.536 <.0001
##
## Tests are performed on the log scale
day.glm1 %>%
emmeans(~TREAT, type = "response") %>%
contrast(method = list(TREAT = cmat)) %>%
summary(infer = TRUE)
## what about absolute differences
day.glm1 %>%
emmeans(~TREAT, type = "link") %>%
regrid() %>%
contrast(method = list(TREAT = cmat)) %>%
summary(infer = TRUE)
Conclusions:
newdata <- day.glm1 %>%
emmeans(~TREAT, type = "response") %>%
as.data.frame()
newdata
## A quick version
ggplot(newdata, aes(y = rate, x = TREAT)) +
geom_pointrange(aes(ymin = asymp.LCL, ymax = asymp.UCL)) +
theme_classic()
A more thorough version that also includes Tukey's pairwise comparisons
g1 <- ggplot(newdata, aes(y = rate, x = TREAT)) +
geom_pointrange(aes(ymin = asymp.LCL, ymax = asymp.UCL)) +
geom_point() +
scale_x_discrete("Treatment",
breaks = c("ALG1", "ALG2", "NB", "S"),
labels = c("Algae spp 1", "Algae spp 2", "Naturally bare", "Scraped bare")
) +
scale_y_continuous(expression(Number ~ of ~ newly ~ recruited ~ barnacles ~ (cm^2))) +
theme_classic()
g2 <- day.pairwise %>%
ggplot(aes(y = ratio, x = contrast)) +
geom_hline(yintercept = 1, linetype = "dashed") +
geom_pointrange(aes(ymin = asymp.LCL, ymax = asymp.UCL)) +
scale_y_continuous(trans = scales::log2_trans(), breaks = scales::breaks_log(base = 2)) +
coord_flip(ylim = c(0.25, 4)) +
theme_classic()
g1 + g2
newdata <- with(day, data.frame(TREAT = levels(TREAT)))
Xmat <- model.matrix(~TREAT, newdata)
coefs <- coef(day.glm1)
fit <- as.vector(coefs %*% t(Xmat))
se <- sqrt(diag(Xmat %*% vcov(day.glm1) %*% t(Xmat)))
q <- qt(0.975, df = df.residual(day.glm1))
newdata <- newdata %>%
mutate(
Fit = exp(fit),
Lower = exp(fit - q * se),
Upper = exp(fit + q * se)
)
## A quick version
ggplot(newdata, aes(y = Fit, x = TREAT)) +
geom_pointrange(aes(ymin = Lower, ymax = Upper)) +
theme_classic()
A more thorough version that also includes Tukey's pairwise comparisons
g1 <- ggplot(newdata, aes(y = Fit, x = TREAT)) +
geom_pointrange(aes(ymin = Lower, ymax = Upper)) +
geom_point() +
scale_x_discrete("Treatment",
breaks = c("ALG1", "ALG2", "NB", "S"),
labels = c("Algae spp 1", "Algae spp 2", "Naturally bare", "Scraped bare")
) +
scale_y_continuous(expression(Number ~ of ~ newly ~ recruited ~ barnacles ~ (cm^2))) +
theme_classic()
g2 <- day.pairwise %>%
ggplot(aes(y = ratio, x = contrast)) +
geom_hline(yintercept = 1, linetype = "dashed") +
geom_pointrange(aes(ymin = asymp.LCL, ymax = asymp.UCL)) +
scale_y_continuous(trans = scales::log2_trans(), breaks = scales::breaks_log(base = 2)) +
coord_flip(ylim = c(0.25, 4)) +
theme_classic()
g1 + g2